Smart Paper Notes

Profiling Television Watching Behaviour Using Bayesian Hierarchical Joint Models for Time-to-Event and Count Data

Rafael A. Moral , Zhi Chen , Shuai Zhang , Sally McClean , Gabriel R. Palma , Brahim Allan , Ian Kegel

Categories: Machine Learning (Statistics)

2022-09-06

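Customer churn prediction is a valuable task in many industries. In telecommunications it is particularly challenging, given the high dimensionality of the data and the difficulty of identifying underlying frustration signatures, which may be an important driver of future churn behaviour. Here, we propose a novel Bayesian hierarchical joint model that characterises customer profiles based on how many events take place within different television watching journeys and how long it takes between events. The model drastically reduces the dimensionality of the data, from thousands of observations per customer to 11 customer-level parameter estimates and random effects. We test our methodology using data from 40 BT customers (20 active and 20 who eventually cancelled their subscription) whose TV watching behaviour was recorded from October to December 2019, totalling approximately half a million observations. Using the parameter estimates and random effects from the Bayesian hierarchical model as features for different machine learning techniques yielded up to 92% accuracy in predicting churn, associated with a 100% true positive rate and a false positive rate as low as 14% on a validation set. Our proposed methodology is an efficient way of reducing the dimensionality of the data while maintaining high descriptive and predictive capabilities. We provide code to implement the Bayesian model at https://github.com/rafamoral/profiling_tv_watching_behaviour.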
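To make the reported validation figures concrete, the following is a minimal sketch of how accuracy, true positive rate, and false positive rate follow from a confusion matrix. The counts are hypothetical (the abstract does not report them) and were chosen only so that the resulting rates land near the quoted 92%, 100%, and 14%; the function name `classification_rates` is likewise an illustrative assumption.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (accuracy, true positive rate, false positive rate)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    tpr = tp / (tp + fn)  # share of churners correctly flagged
    fpr = fp / (fp + tn)  # share of active customers wrongly flagged
    return accuracy, tpr, fpr


# Hypothetical validation counts, NOT taken from the paper:
# 7 churners, all detected; 7 active customers, 1 wrongly flagged.
accuracy, tpr, fpr = classification_rates(tp=7, fp=1, tn=6, fn=0)
print(f"accuracy={accuracy:.3f}, TPR={tpr:.3f}, FPR={fpr:.3f}")
# accuracy=0.929, TPR=1.000, FPR=0.143
```

With a validation set this small, a single misclassified customer moves the false positive rate by about 14 percentage points, which is worth keeping in mind when reading the headline numbers.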

Translated by Google Translate

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Designing Ecosystems of Intelligence from First Principles

Karl J Friston , Maxwell J D Ramstead , Alex B Kiefer , Alexander Tschantz , Christopher L Buckley , Mahault Albarracin , Riddhi J Pitliya , Conor Heins , Brennan Klein , Beren Millidge

Categories: Artificial Intelligence

2022-12-02

This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants, what we call "shared intelligence". This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world, also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing, leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first (and key) step towards such an ecology.


First principles physics-informed neural network for quantum wavefunctions and eigenvalue surfaces

Marios Mattheakis , Gabriel R. Schleder , Daniel Larson , Efthimios Kaxiras

Categories: Machine Learning

2022-11-08

Physics-informed neural networks have been widely applied to learn general parametric solutions of differential equations. Here, we propose a neural network to discover parametric eigenvalue and eigenfunction surfaces of quantum systems. We apply our method to solve the hydrogen molecular ion. This is an ab-initio deep learning method that solves the Schrödinger equation with the Coulomb potential, yielding realistic wavefunctions that include a cusp at the ion positions. The neural solutions are continuous and differentiable functions of the interatomic distance, and their derivatives are analytically calculated by applying automatic differentiation. Such a parametric and analytical form of the solutions is useful for further calculations such as the determination of force fields.


Solving the Online Assignment Problem with Machine Learned Advice

Clarence Gabriel R. Kasilag , Pollux M. Rey , Jhoirene B. Clemente

Categories: Machine Learning

2022-08-08

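The online assignment problem plays an important role in operational research and computer science, which is why great attention has been given to improving the quality of its solutions. Because of incomplete information about the input, it is difficult for online algorithms to produce the optimal solution. The quality of an online algorithm's solution is measured by its competitive ratio. No deterministic online algorithm can achieve a competitive ratio better than (2n-1). It has been shown that advice in online computation improves the lower bound of the competitive ratio of online problems. Advice in online computation can be interpreted as additional information given to the online algorithm to compensate for the lack of information about the whole input sequence. In this study, we investigate how introducing machine-learned advice could improve the competitive ratio for this problem. By simulating a machine learning algorithm that predicts the whole input in advance, we provide an online algorithm for the online assignment problem. We use an optimal offline algorithm to produce a matching solution from the predicted input. Furthermore, we investigate how the prediction error of the machine learning algorithm affects the competitive ratio of the online algorithm. We use a benchmark data set to perform our empirical analysis. We show that as the machine learning prediction error increases, the solution quality decreases. Moreover, the magnitude of the error is directly proportional to the size of the input. This result is analogous to the competitive ratio of the best deterministic algorithm for the online assignment problem, which also depends on the parameter n.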
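The idea of planning on a predicted input and then serving the real requests with that plan can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: servers and requests are points on a line, cost is absolute distance, the "optimal offline algorithm" is a brute-force search over permutations, and the actual request arriving at position i is served by whichever server the plan gave to the i-th predicted request. All function names are hypothetical.

```python
from itertools import permutations


def assignment_cost(servers, requests, assignment):
    """Total distance when request i is served by servers[assignment[i]]."""
    return sum(abs(servers[s] - r) for s, r in zip(assignment, requests))


def optimal_offline(servers, requests):
    """Brute-force minimum-cost assignment; only viable for tiny inputs."""
    n = len(requests)
    return min(permutations(range(n)),
               key=lambda perm: assignment_cost(servers, requests, perm))


def follow_prediction(servers, predicted, actual):
    """Plan on the predicted requests, then serve the actual ones with that plan."""
    plan = optimal_offline(servers, predicted)
    return assignment_cost(servers, actual, plan)


servers = [0, 4, 10]
actual = [5, 9, 1]
optimum = assignment_cost(servers, actual, optimal_offline(servers, actual))

# Predictions ordered from exact to increasingly wrong.
for predicted in ([5, 9, 1], [6, 8, 2], [3, 6, 4], [1, 5, 9]):
    error = sum(abs(p - a) for p, a in zip(predicted, actual))
    ratio = follow_prediction(servers, predicted, actual) / optimum
    print(f"prediction error {error:2d} -> cost ratio {ratio:.2f}")
```

In this toy instance the ratio stays at 1 for the exact and the slightly perturbed prediction, then rises to 3 and to roughly 6.3 for the two worse ones, which mirrors the qualitative claim that solution quality degrades as prediction error grows.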


No Language Left Behind: Scaling Human-Centered Machine Translation

NLLB team , Marta R. Costa-jussà , James Cross , Onur Çelebi , Maha Elbayad , Kenneth Heafield , Kevin Heffernan , Elahe Kalbassi , Janice Lam , Daniel Licht

Categories: Natural Language Processing | Artificial Intelligence

2022-07-11

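Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe, high-quality results, all while keeping ethical considerations in mind? In No Language Left Behind, we took on this challenge by first contextualizing the need for low-resource language translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed at narrowing the performance gap between low-resource and high-resource languages. More specifically, we developed a conditional compute model based on a sparsely gated mixture of experts, trained on data obtained with novel and effective data mining techniques tailored for low-resource languages. We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state of the art, laying important groundwork towards realizing a universal translation system. Finally, we open source all contributions described in this work, accessible at https://github.com/facebookresearch/fairseq/tree/nllb.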


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

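Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.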


Automated analysis of fibrous cap in intravascular optical coherence tomography images of coronary arteries

Juhwan Lee , Gabriel T. R. Pereira , Yazan Gharaibeh , Chaitanya Kolluru , Vladislav N. Zimin , Luis A. P. Dallan , Justin N. Kim , Ammar Hoori , Sadeer G. Al-Kindi , Giulio Guagliumi

Categories: Machine Learning | Computer Vision

2022-04-21

Thin-cap fibroatheroma (TCFA) and plaque rupture have been recognized as the most frequent risk factors for thrombosis and acute coronary syndrome. Intravascular optical coherence tomography (IVOCT) can identify TCFA and assess cap thickness, which provides an opportunity to assess plaque vulnerability. We developed an automated method that can detect lipidous plaque and assess fibrous cap thickness in IVOCT images. This study analyzed a total of 4,360 IVOCT image frames of 77 lesions among 41 patients. To improve segmentation performance, preprocessing included lumen segmentation, pixel-shifting, and noise filtering on the raw polar (r, theta) IVOCT images. We used the DeepLab-v3 plus deep learning model to classify lipidous plaque pixels. After lipid detection, we automatically detected the outer border of the fibrous cap using a special dynamic programming algorithm and assessed the cap thickness. Our method provided excellent discriminability of lipid plaque with a sensitivity of 85.8% and an A-line Dice coefficient of 0.837. By comparing lipid angle measurements between two analysts following editing of our automated software, we found good agreement by Bland-Altman analysis (difference 6.7 ± 17 degrees; mean 196 degrees). Our method accurately detected the fibrous cap from the detected lipid plaque. Automated analysis required a significant modification for only 5.5% of frames. Furthermore, our method showed good agreement of fibrous cap thickness between two analysts with Bland-Altman analysis (4.2 ± 14.6 µm; mean 175 µm), indicating little bias between users and good reproducibility of the measurement. We developed a fully automated method for fibrous cap quantification in IVOCT images, resulting in good agreement with determinations by analysts. The method has great potential to enable highly automated, repeatable, and comprehensive evaluations of TCFAs.
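The agreement metrics quoted in this abstract can be illustrated with a short sketch. It assumes the standard definitions: the Dice coefficient as twice the overlap divided by the summed sizes of the two label sets, sensitivity as true positives over all reference positives, and Bland-Altman limits of agreement as the mean difference plus or minus 1.96 sample standard deviations. Whether the paper's "±" values are standard deviations or limits is not stated in the abstract, and the data below are made up purely for illustration.

```python
from statistics import mean, stdev


def dice_coefficient(predicted, reference):
    """Dice overlap of two equal-length binary label lists (1 = lipid A-line)."""
    overlap = sum(p and r for p, r in zip(predicted, reference))
    total = sum(predicted) + sum(reference)
    return 2 * overlap / total if total else 1.0


def sensitivity(predicted, reference):
    """Fraction of reference-positive labels that were also predicted positive."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    return tp / (tp + fn)


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up per-A-line labels: automated output vs. analyst annotation.
automated = [1, 1, 1, 0, 0, 1, 0, 0]
annotated = [1, 1, 0, 0, 0, 1, 1, 0]
print(dice_coefficient(automated, annotated), sensitivity(automated, annotated))

# Made-up cap thickness readings (in micrometres) from two analysts.
analyst_1 = [170, 180, 160, 190]
analyst_2 = [168, 176, 163, 185]
print(bland_altman(analyst_1, analyst_2))
```

A bias close to zero with narrow limits indicates that the two analysts rarely disagree by much, which is the sense in which the abstract reports "little bias between users".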


Complete identification of complex salt geometries from inaccurate migrated subsurface offset gathers using deep learning

Ana Paula O. Muller , Jesse C. Costa , Clecio R. Bom , Elisangela L. Faria , Matheus Klatt , Gabriel Teixeira , Marcelo P. de Albuquerque , Marcio P. de Albuquerque

Categories: Computer Vision

2022-04-20

Delimiting salt inclusions from migrated images is a time-consuming activity that relies on highly human-curated analysis and is subject to interpretation errors or limitations of the methods available. We propose to use migrated images produced from an inaccurate velocity model (with a reasonable approximation of sediment velocity, but without salt inclusions) to predict the correct salt inclusions shape using a Convolutional Neural Network (CNN). Our approach relies on subsurface Common Image Gathers to focus the sediments' reflections around the zero offset and to spread the energy of salt reflections over large offsets. Using synthetic data, we trained a U-Net to use common-offset subsurface images as input channels for the CNN and the correct salt-masks as network output. The network learned to predict the salt inclusions masks with high accuracy; moreover, it also performed well when applied to synthetic benchmark data sets that were not previously introduced. Our training process tuned the U-Net to successfully learn the shape of complex salt bodies from partially focused subsurface offset images.


Conservation Tools: The Next Generation of Engineering–Biology Collaborations

Andrew Schulz , Cassie Shriver , Suzanne Stathatos , Benjamin Seleb , Emily Weigel , Young-Hui Chang , M. Saad Bhamla , David Hu , Joseph R. Mendelson III

Categories: Machine Learning

2023-01-03

The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
